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Ordinary differential equations are widely-used in the field of systems biology and chemical engi- 
neering to model chemical reaction networks. Numerous techniques have been developed to estimate 
parameters like rate constants, initial conditions or steady state concentrations from time-resolved 
data. In contrast to this countable set of parameters, the estimation of entire courses of network 
components corresponds to an innumerable set of parameters. The approach presented in this work 
is able to deal with course estimation for extrinsic system inputs or intrinsic reactants, both not 
being constrained by the reaction network itself. Our method is based on variational calculus which 
is carried out analytically to derive an augmented system of differential equations including the 
unconstrained components as ordinary state variables. Finally, conventional parameter estimation 
is applied to the augmented system resulting in a combined estimation of courses and parameters. 
The combined estimation approach takes the uncertainty in input courses correctly into account. 
This leads to precise parameter estimates and correct confidence intervals. In particular this implies 
that small motifs of large reaction networks can be analysed independently of the rest. By the use 
of variational methods, elements from control theory and statistics are combined allowing for future 
transfer of methods between the two fields. 



INTRODUCTION 



Frequently, signalling pathways and chemical reaction 
networks in systems biology are modelled by ordinary 
differential equations (ODE). In many cases, the reac- 
tion networks are open systems comprising external in- 
puts like drug stimuli. The system is then modelled by a 
non-autonomous ODE. 

Similarly, modules of reaction networks are open systems. 
The nodes they have in common with the surrounding 
network are not or not entirely determined by the module 
species. They can be considered as intrinsic inputs and 
again the system can be modelled by a non-autonomous 
ODE. An example for such a cross-talk can be found in 

ID- 

While reaction rates and initial reactant concentrations 
form a countable set of parameters, inputs correspond to 
an innumerable set of parameters since in general, ev- 
ery function of time is possible as input and each func- 
tion value at each time point is a free parameter. Com- 
monly, if measurements for the inputs are available, non- 
parametric estimates like smoothing splines are employed 
to describe the input data [5] [3]. Given the input, an 
objective function depending on rate parameters and ini- 
tial values is defined and its minimum is approached by 
numerical optimization methods. In this way, the prob- 
lem of infinitely many parameters is avoided. As we will 
show, one problem associated to this approach is that it 
does not account for the uncertainty present in the in- 
put. As a consequence, estimated parameter confidence 
intervals do not cover the actual variability, i.e. they are 
too small. 

Therefore, it is preferable to parametrize the input which 



is possible if certain knowledge about origin and processes 
underlying inputs is available. This enables a reason- 
able choice of basis functions and the parametrization 
becomes finite. Following this approach, the problem of 
erroneous confidence intervals is circumvented presuming 
that the input model is correct. However, this assump- 
tion is problematic if only sparse information about the 
inputs and few measurement points are available. 
We propose to approach the problem of input 
parametrization by calculus of variations. In the Result 
section, the system's objective function used for ordinary 
parameter estimation is extended to a functional to be 
minimised. The original non-autonomous ODE is trans- 
formed into an augmented autonomous ODE. The result 
is interpreted and applied to simulated data. 

METHOD 

Derivation of the augmented ODE system 

In conventional parameter estimation, the objective 
function to be optimised is the likelihood function or 
the function. If a reaction network with species y^, 
fi = 1, . . . , n and reaction parameters pfe, k = 1, . . . , r, 
comprises inputs a;^, v = 1, . . . ,m, the dynamics of the 
system is described by the model 

yt,{t) ^ ff,{y{t),x{t),p), yp(0) = ?/p^o, (1) 

with dynamic variables and time-dependent input 
functions x,^(t), each of them collected in the vectors 
y G K" and x £ E™ . In the following, the dependence on 
the whole course of x will be emphasized by the notation 



2 



[x]. Furthermore, it is assumed that the input function 
x{t) is difFerentiable. Commonly used inputs Uke step 
functions or injections are rather distribution hke than 
differentiable functions. However, it is assumed that on 
the physiological level the acting input is more accurately 
described by a differentiable function. 
The objective function 



(2) 



penalizes distances between species measurements y^i 
and model prediction y^(ii, [xjjp, ?/o) at time points ti 
quadratically and weighted by the measurement uncer- 
tainties a n . In addition, input measurements x^, are 
compared with the input function values Xi,(ti). In par- 
ticular, is already a functional of [x]. In case of Gaus- 
sian noise, eq. ^ coincides with the maximum likelihood 
estimator. 

Our aim is to find a unique input function which min- 
imises the functional defined in eq. ([2|. To this purpose, 
we compute the first variation and check under which 
condition the first variation vanishes. See [1] for a general 
introduction to variational calculus as well as sections 1-2 
in the supplement . For the first variation we obtain 



ti 



=Syh 



+ res^iU) ■ h 



(3) 



The trajectory variation Syh is derived by eq. (jlj and 
is expressed by variation of constants: denotes the 
fundamental system of the homogeneous linear problem 
(j) = "SI yfcj) with the matrix Vy/ of first derivatives of / 
with respect to y and Vxfh constitutes the inhomogene- 
ity. Furthermore, a weighted residual function is defined 



-, analogously res^;^. For a de- 



tailed derivation see sections 2-5 in the supplement. 
Next, h needs to be separated. Similarly to Euler- 
Lagrange's equation |4], partial integration needs to be 
performed to extract h from the integral. However, there- 
fore the sum in eq. ^ needs to be extended to an inte- 
gral, all time-discrete measurement points and 
have to be replaced by continuous and differentiable data 
representations by means of a mapping S : C^(M) 
from N discrete values to a differentiable function. The 
resulting representations Sy^ (i), Sx^ (t) as well as S^^^ (t), 
(t) need to be defined at least on a finite interval 



[0, T] where T denotes the latest time point to be consid- 
ered. After partial integration, the first variation for the 
just defined time-continuous x^ functional reads 



\ ^ 



<I>*res,, dr +reSr -hdt. 



(4) 



with the auxiliary function u. The transpose is denoted 
by * . The integral, i.e. the first variation, vanishes for all 
choices of h if and only if the integrand is zero, leading to 
eq. ([5| . The auxiliary function u is equivalently expressed 
by its corresponding differential equation, eq. (|6|. Here, 
it is used that is a fundamental system for (j) — 
—Vyfcf) which follows from <I> being a fundamental system 
for (p = +'\/yf4>. Together with eq. iu we obtain: 



= V:,/*u + res^ 
ii ~ S/yf*u — resy 
y = f{y,x,p)- 



(5) 
(6) 
(7) 



The right-hand sides of eqs. ([6|7| depend on state 
variables y, u, and x, the latter being constrained by 
eq. ([5]). Particularly, if the input enters linearly in 
the dynamics of the reaction network, Vxf is inde- 
pendent of X and eq. ([s]) can be directly solved for 
X, i.e. X = Sx — Sfj2V xf*u. However, even in the 
non-linear case, the implicit function theorem provides 
the possibility to check locally whether eq. ([s]) uniquely 
defines x{u,y). For the discussion of a global version of 
this statement, see section 6 of the supplement. 
From the definition of u it follows that u{T) — 
needs to vanish at the final time point T. Hence, 
the augmented ODE system needs to satisfy two-way 
boundary conditions y(0) = yo and u(T) = 0. This fact 
constitutes a remarkable difference to the original initial 
value problem. 



Interpretation 

Starting from a dynamic system with inputs and 
measurements for both, state variables and inputs, we 
have derived differential equations for both of them. 
The original initial value problem has been transformed 
into a boundary value problem which is to be solved 
numerically. The solution trajectories Y{t\p,yo) = 
{y(t\p, yo), x{t\p, yo)) minimise the x^ functional for given 
dynamic parameters p and initial values yo- However, 
there is still notable freedom in the choice of data and 
uncertainty representations, denoted by Sy, Sx and S^, 
which decides about the meaning of the solution trajec- 
tories. 
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One possibility to define time-continuous data represen- 
tations Sy and Sx is smoothing splines. They consti- 
tute prior knowledge for each component about shape 
and time-scale of changes based solely on the measure- 
ment points. Also S'ct needs to be chosen appropriately. 
Differences between model prediction and data prior are 
usually weighted by w{t) — g ^^^^ at each time point t. 
Especially if data sampling is sparse, the data prior has 
larger uncertainty when far away from measurement time 
points. In this case, a reasonable choice of ■w{t) is given 

by 



system, 



1 



V27 



(8) 



i.e. a sum of Gaussians located around the measurement 
points. The parameter r is a measure for the correlation 
length of the data prior. 

Once data and uncertainty representations are chosen, 
the solution trajectories Y can be employed for conven- 
tional parameter estimation minimising 



Y, 



- Yf,{U\p,yo) 



(9) 



over the finite dimensional parameter space of p and 
yo- Note that the time-discrete function and the 
time-continuous functional do not coincide exactly. 
Thereby, different measures of optimality are applied 
to input functions and parameters. This difference 
is resolved in the asymptotic case of infinitely many 
measurement points. 

The distinction between parameter estimation and in- 
put reconstruction has further implications on the esti- 
mation of uncertainty bounds. Confidence intervals can 
only be assigned to the dynamic parameters and initial 
conditions. In contrast, the input becomes a usual state 
variable by construction. For state variables, the confi- 
dence region in parameter space needs to be mapped to 
state space by prediction, i.e. by evaluating the model 
for different parameter values within the confidence re- 
gion. This can e.g. be realized by parameter sampling 
using MCMC methods. Alternatively, profile likelihoods 
can be employed [5]. 



Technical remarks 

It is important for the interpretation of x{t) as a species 
concentration that x{t) > for all times t E [0,T]. This 
is not imposed a priori on the solution x(t). Rather, it 
needs to be enforced by construction, analogously to the 
state variables in the ODE of the dynamic system. This 
can be realized by the following extension of the dynamic 



y 

X 



f{y.x,p) 

-D(t)x, 



(10) 

(11) 



with a diagonal matrix D{t) = diag((ii(t), . . . , c?„i(i)) 
of new inputs di , . . . , dm ■ By construction, x can not 
change sign over time. While the prior course for D{t) 

is rather clear, i.e. d^it) = —5^, the uncertainty prior 
constitutes a subjective choice. It reflects the belief 
how fast x{t) can change at most. Besides enforcing 



positivity of the input, the extension by eq. ( 11 ) presents 



a workaround for dealing with non-linear inputs because 
the new input variables d^ enter linearly and the old 
inputs Xn become regular state variables. 

If / depends linearly on x, eq. ([s]) can be solved for x 
explicitly. This ensures computational efhciency. In the 
non-linear case, matrix inversion has to be performed 
in each evaluation step of the ODE which might slow 
down the computation of the solution remarkably. Al- 
ternatively to the introduction of new input variables. 



eq. (Ill, the computationally intensive approach can be 



avoided by a change of variables. This is possible if state 
variables and input variables factorize, i.e. if 



fij,{y,x,p) = ^gt,u{y,p)S:u +5M,o(y,p), m = i, 



(12) 



where 17^1, and g^^o do not depend on the input variables 
which have been transformed to x ^ ip(x,p) by a coor- 
dinate transformation ip. Examples could be ip{x) = x^ 



Lp(x, Kb) 



Kd+x 



for a bimolecular or an enzymatic 



reaction, respectively. The possibility of a change of 
variables covers a broad range of biologically relevant 
reaction networks. 



Although computation for linear input is remarkably 
faster than for non-linear input, it is still slower than 
solving an initial value problem. On the other hand, 
the solution of the boundary value problem is already 
optimal with regard to the input course. Therefore, 
computing time has to be compared to the time a pa- 
rameter optimization algorithm takes to estimate the 
parametrized input course. The comparison will strongly 
depend on the number of parameters that are necessary 
to parametrize the partially unknown input. So far, there 
has not been a comprehensive study comparing the two 
methods. 
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APPLICATION TO SIMULATED DATA 

The approach is apphed to the fonowing toy modeh 

X 

; (13) 

A ^ B. 

The forward reaction A — > i? is mediated by x while 
the back reaction _B — > A is unaffected by the input x. 
According to eqs. ([5][7|), the augmented ODE system for 



A, B and x is given by 

A^-kiAx + k2B (14) 

B = kiAx - k2B (15) 

iiA ^ kix{uA - ub) - ^-^r-^ (16) 

ub = -hiuA- ub) ^ — - (17) 



° B 

with the auxihary state variables ua, ub, the data 
representations Sa, Sb and the uncertainty represen- 
tations 5*^2 and S„2 . The input x is related to the 
other state variables by a; = S'^^ + S„2kiA(uA — ub)- 
Several input functions x have been chosen for data 
generation, among them an exponential decay, x ~ e~"*, 
an activation dynamics with a slow decay after a fast 
increase, x ~ e~"' — e~^* with a > (3, and a Gaussian 
_(*-*dip)^ 

input, x ~ , „ e 2^ . The example is numeri- 
cally implemented in C and in R [6 . Optimization is 
performed by a Gauss-Newton algorithm for nonlinear 
least-squares estimation. 

The purpose of this section is to compare parameter 
estimation for the variational and the fixed input ap- 
proach. The input data prior, i.e. the smoothing spline 
through the simulated input data points, is employed as 
input function for the fixed input approach. 

Examples with Gaussian input are depicted in Fig- 
ures [T] and |2] All components. A, B, and x depicted in 
Figure [l]A_-B have been measured at 20 time points. In 
this case of dense sampling, the data priors, charted as 
dotted lines in Figure [TJ\, come close to the estimated 
time-courses, charted as dotted lines in Figure [Tj3. 
This is reflected in the distributions of the parameter 
estimates in Figure [ip: for the same set-up, 1000 noise 
realizations have been generated and the variational 
approach has been used for parameter estimation. 
In order to compare the result with the fixed input 
approach, the data prior of x has been employed as 
input and conventional parameter estimation has been 
performed. Hence, in the setting of dense sampling and 
small noise, both estimation approaches perform equally 
in terms of accuracy and precision. 




1.4 1.6 1.8 2.0 2.2 2.4 2.6 2.2 2.3 2.4 2.5 2.6 

k2 BO 



FIG. 1; (A-B) Simulated data for the species A, B and the 
input X. True time courses are denoted by continuous lines. 
Data points are subject to Gaussian noise with a — 0.1. (A) 
Data representations are indicated as dashed lines, (B) so- 
lution trajectories after parameter estimation are shown as 
dashed lines. (C) Comparison of parameter distributions ob- 
tained from 1000 repetitions of data generation and parameter 
estimation for the variational and the fixed input approach. 



A rather different situation is depicted in Figure [2]A.- 
B. The input x is measured at four time points only, 
leading to a poor data prior shown as green dotted 
line in Figure [2]A.. Like before, the species A and B 
have been measured at 20 time points. Most of the 
information about the dynamics of the input is encoded 
in these measurements. The correlation time r has been 
chosen to be much smaller than the distance between 
time points allowing for much interstitial variability. 
The resulting trajectories Y after parameter estimation 
are shown as dotted lines in Figure [2|3. The true input 
curve is reconstructed almost entirely. The noticeable 
fluctuations are caused by coincidental noise correlations 
between species A and B: simultaneous deviations from 
the true course in opposite directions lead to immediate 
breakouts of the reconstructed input. 
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Also for this set-up, 1000 noise realizations have been 
generated for the comparison of the variational and 
the fixed input approach. The parameter and initial 
value distributions for both approaches are shown in 
Figure[2p. Since the true input can be reconstructed, the 
variational approach is able to estimate all parameters 
accurately. In contrast, when the input is fixed to the 
apparent input data prior, parameter estimation leads 
to biased parameter estimates. 




— variational — fixed input 




1 2 3 4 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 

k2 BO 



FIG. 2: (A-B) Input reconstruction - simulated data for the 
species A, B and the input x. True time courses are denoted 
by continuous lines. Data points are subject to Gaussian noise 
with a = 0.1. For the input, only 4 data points are provided. 
(A) Data representations are indicated as dashed lines, (B) 
solution trajectories after parameter estimation are shown as 
dashed lines. (C) Comparison of parameter distributions ob- 
tained from 1000 repetitions of data generation and parameter 
estimation for the variational and the fixed input approach. 

Finally, we investigated the coverage probability f7^ of 
the confidence intervals derived from the variational and 
the fixed input approach: for each simulated data set, 
parameter estimation is performed, confidence intervals 
are computed and the information if the true parameter 



value is situated within the 68% /90% confidence interval 
is collected. This information is cumulated over many 
runs of data generation. 

Figures |3]4. and [3j3 show the results for Gaussian input 
with 20 input measurements and 4 input measurements 
respectively. In each case, 20 measurement points have 
been provided for each of the species A and B. 
For both estimation approaches, confidence intervals of 
estimated parameters and initial values have been pro- 
duced by means of the profile likelihood approach |H] with 
respect to eq. (|9|. 

For the set-up with 20 input measurement points, both 
estimation approaches provide accurate estimators with 
similar variances as confirmed by Figure [l] However, as 
Figure |3] shows, the coverage differs significantly between 
the two approaches. Confidence intervals for ki and ^2 
are systematically underestimated by the fixed input ap- 
proach. The variational approach in contrast is able to 
correctly take the degrees of freedom in the input into 
account. Thus, the coverage is close to the expected val- 
ues. 

For the set-up with 4 input measurement points, the 
variational approach performs significantly better than 
the fixed input approach with respect to coverage. How- 
ever, also the variational approach produces confidence 
intervals that are slightly too small for the dynamic pa- 
rameters fci and fc2. Figure [3j3 left, and too small for 
the estimated initial values. Figure [3j3 right. The reason 
for this behaviour is a combination of the small correla- 
tion length T and the objective function given by eq. (|9|. 
Short values of r allow that the input function has fast 
fluctuations. Especially around the input measurement 
points, function values tend to punctually approach the 
measured values, favoured by the time-discrete objective 
function. Since these fluctuations occur at a short time 
scale, it has little influence on the course of A and B and 
thereby, estimation of the dynamic parameters is almost 
unaffected. 

This case shows that r needs to be chosen appropriately 
for the problem: small for comprehensive input recon- 
struction and larger for propagation of uncertainties. A 
second possibility would be to adapt statistical results for 
conventional parameter estimation to the case of time- 
continuous objective functions. 

CONCLUSION 

In many applications, it is difficult to guess a proper 
input model because input data is not available or too 
noisy. Instead of parametrizing the input, we employed 
variational calculus to transform the ODE into an 
augmented system of ODEs describing the original and 
the input components. The solution of this system 
minimises the functional which plays a central role 
and is directly associated to the objective function of 
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FIG. 3: Coverage for variational approach (blue) and fixed input approach (red). Continuous lines correspond to a 90% coverage 
probability, dashed lines to 68%. Both probabilities are indicated as black dashed horizontal lines. (A) Example with 20 input 
measurement points, (B) example with 4 input measurement points. 



the original estimation problem. Since the extension 
of the function to the functional is not unique, 
the new functions, i.e. continuous data and uncertainty 
representations, need to be chosen intentionally. To 
this end we propose smoothing splines that have a 
concrete interpretation as data priors. Especially in the 
case of sparse sampling we propose to use weighting 
functions for the uncertainties. By this means, existing 
measurement points are taken into account and the 
course between time points is not excessively constrained 
by the data prior. 

In the field of control theory and optimal control, so 
called cost functional take the role of our functional. 
Once chosen the appropriate x^ functional, our approach 
to input estimation can be embedded in the general 
framework of Pontryagin's minimum principle [9] and 
eqs. (6|7| can be identified as a Hamiltonian system. 



We showed that our combined variational approach 
to parameter estimation enables the assembly of all 
information present in species and input measurements. 
By this means, it accounts properly for variability 
in the input due to measurement uncertainties and 
produces correct confidence bounds. Depending on 
the situation, the combination of all information leads 
to comprehensive reconstruction of the input curves. 
Information about the dynamics of the input can be 
concentrated in the species measurements like Figure [l] 



shows. In such cases our approach clearly outperforms 
conventional approaches. The variational method is even 
applicable if no input measurements are available or if 
species are partially unobserved. A prominent example 
where the presented method could be applied is the 
PI3K/AKT/mT0R pathway [10]. Even though various 
mTOR complexes and their phosphorylated states can 
be measured, it is not clear how they mediate AKT 
activation. By applying the variational method to AKT 
data, it would be possible to reconstruct the required 
mediator and subsequently relate it to mTOR complex 
measurements. 

A completely different field of application is network 
modularization. The entire network can be dissected 
preferably at nodes where measurements are available. 
These nodes are then treated as independent inputs thus 
disentangling the network. In this way, the number of 
equations the variational approach has to deal with is 
kept small and computational efficiency is ensured. 

A further step after the introduction of a time- 
continuous objective function would be to use the same 
function for parameter estimation. The time-continuous 
version of the objective function is closely related to the 
original function. Therefore, we are confident that it is 
possible to endow the time-continuous objective function 
with statistical meaning. This would not only allow for 
employing the same objective for parameter estimation 



7 



and input reconstruction in our application. It would 
also enable the transfer of many more results from con- 
trol theory and make it suitable for statistical inference. 
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